Transparent Recovery from Operating System Errors

Author

  • Francis M. David
Abstract

Errors that occur in operating systems usually impact all user applications and may render a computer unusable. This is unfortunately the case even when the error only affects an operating system component that is not crucial to the functioning of most applications. CuriOS is a new operating system that uses lightweight distribution, isolation and persistence of state to transparently recover from errors and continue running existing applications. Errors are detected through techniques such as virtual memory protection, watchdog timers and checksums. Errors are fixed by restarting OS components without any loss of state.
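The recovery idea in the abstract (state held outside the component, errors detected by a check, and the component restarted without losing that state) can be illustrated with a small user-level sketch. This is not CuriOS code: the names StateRecord, CounterService and Supervisor are hypothetical, a raised exception stands in for a detected fault, and a CRC-32 stands in for the checksums the abstract mentions.

```python
import zlib
from dataclasses import dataclass


@dataclass
class StateRecord:
    """Client state kept outside the component, sealed with a checksum."""
    payload: bytes
    checksum: int

    @classmethod
    def seal(cls, payload: bytes) -> "StateRecord":
        return cls(payload, zlib.crc32(payload))

    def is_intact(self) -> bool:
        # Detection step: recompute the checksum and compare.
        return zlib.crc32(self.payload) == self.checksum


class CounterService:
    """Hypothetical component that keeps no state of its own between requests."""

    def __init__(self) -> None:
        self.healthy = True  # flipped to False to simulate a fault

    def increment(self, record: StateRecord) -> StateRecord:
        if not self.healthy:
            raise RuntimeError("component fault")
        value = int.from_bytes(record.payload, "big") + 1
        return StateRecord.seal(value.to_bytes(8, "big"))


class Supervisor:
    """Owns the persisted state and restarts the component on failure."""

    def __init__(self) -> None:
        self.service = CounterService()
        self.state = StateRecord.seal((0).to_bytes(8, "big"))
        self.restarts = 0

    def request(self) -> int:
        if not self.state.is_intact():
            # Assumption of this sketch: corrupted state is reported, not repaired.
            raise ValueError("persisted state failed its checksum")
        try:
            new_state = self.service.increment(self.state)
        except RuntimeError:
            # Restart: a fresh component instance, the same persisted state.
            self.service = CounterService()
            self.restarts += 1
            new_state = self.service.increment(self.state)
        self.state = new_state
        return int.from_bytes(new_state.payload, "big")
```

Because the counter lives in the supervisor's StateRecord rather than inside CounterService, replacing the service object after a fault leaves the count untouched, and the caller of request() sees only a normal return value.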

Similar resources

Recovering from Operating System Errors

User applications and data in volatile memory are usually lost when an operating system crashes because of errors caused by either hardware or software faults. This is because most operating systems are designed to stop operation when some internal errors are detected irrespective of the possibility that user data and applications might still be intact and recoverable. Techniques like exception...

Exploring Recovery from Operating System Lockups

Operating system lockup errors can render a computer unusable by preventing the execution of other programs. Watchdog timers can be used to recover from a lockup by resetting the processor and rebooting the system when a lockup is detected. This results in a loss of unsaved data in running programs. Based on the observation that volatile memory is not affected when a processor reset occurs, we p...

System Support for Software Fault Tolerance in Highly Available Database Management Systems

Today, software errors are the leading cause of outages in fault tolerant systems. System availability can be improved despite software errors by fast error detection and recovery techniques that minimize total downtime after an outage. This dissertation analyzes software errors in three commercial systems and describes the implementation and evaluation of several techniques for early error det...

Fail-safe PVM: A portable package for distributed programming with transparent recovery

Many scientific problems benefit from computations that are parallel at a coarse grain. Collections of loosely coupled, heterogeneous computers are increasingly being applied to these problems. While individual computers are designed to be relatively reliable, a collection of several autonomous machines necessarily has a greater rate of failure. As data networks improve, and larger multicomputer...

Investigation of the Concentration of the Anesthetic Gas Nitrous Oxide (N2O) in the Air of Operating and Recovery Rooms

Chronic exposure to N2O environmental pollution may influence the health of personnel working in operating and recovery rooms. Human studies have indicated that chronic exposure to N2O may decrease mental performance, audiovisual ability, and manual dexterity and may also cause adverse reproductive effects like reduced fertility, spontaneous abortion and neurological, renal, and liv...

Publication date: 2007